
Implement parallelization options as explicit inputs. #554

Merged: 2 commits merged into aiidateam:develop on Nov 23, 2020

Conversation

@greschd (Member) commented Sep 18, 2020

Fixes #405.

If any parallelization option is already specified in `settings['CMDLINE']`, a `DeprecationWarning` is issued; however, the value from `settings['CMDLINE']` still takes precedence. This allows us to implement the defaults at the level of the input port, without having to care about them later on.

The BasePwCpInputGenerator is given three class attributes:

  • _ALLOWED_PARALLELIZATION_FLAGS is a list of tuples (flag_name, default, help), of all possible flags
  • _ENABLED_PARALLELIZATION_FLAGS is a list of flag names that are implemented in a particular code.
  • _PARALLELIZATION_FLAG_ALIASES is used to detect all possible variations on a flag name.
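The three attributes described above can be sketched as follows. This is an illustrative skeleton only: the flag names, defaults and help strings shown here are assumptions for the example, not the actual plugin code.

```python
class BasePwCpInputGenerator:
    """Illustrative skeleton; the real class lives in aiida-quantumespresso."""

    # (flag_name, default, help) tuples for all flags any QE code may accept.
    _ALLOWED_PARALLELIZATION_FLAGS = (
        ('npool', 1, 'Number of k-point pools.'),
        ('ndiag', None, 'Number of processes for linear-algebra diagonalization.'),
    )

    # Subset of flag names actually implemented by a particular code.
    _ENABLED_PARALLELIZATION_FLAGS = ('npool', 'ndiag')

    # Alternative spellings QE recognizes for each canonical flag name.
    _PARALLELIZATION_FLAG_ALIASES = {
        'npool': ('npools', 'nk', 'nks'),
        'ndiag': ('northo', 'nd'),
    }
```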

TODO:

  • Testing
  • Documentation: Not sure where best to put this - the auto-generated help will be there in any case.
  • Check if flag names are case sensitive. If not, this needs to be taken into account when checking for existing flags. Update: If flags are capitalized differently (e.g. -nPools) QE will simply ignore them.
  • Set _ENABLED_PARALLELIZATION_FLAGS for codes other than pw.x. Help needed: I have no idea what flags other codes support.
  • PW accepts -nimage, but I'm not sure if passing that makes any sense (a PW calculation only treats one "image"). If it doesn't, we should probably disable nimage in the PwCalculation. Update: Disallowed -nimage for pw.x for now - if there is in fact a use case, we can always re-enable it.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 122374f to 40aaf0e Compare September 18, 2020 22:04
@greschd (Member, Author) commented Sep 18, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

@greschd (Member, Author) commented Sep 18, 2020

@sphuber this is ready for review, but not quite for merging yet.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 1a41823 to 0f92866 Compare September 22, 2020 14:10
@greschd (Member, Author) commented Sep 22, 2020

Aliases are taken from https://gitlab.com/QEF/q-e/-/blob/develop/Modules/command_line_options.f90#L93

The choice of which alias is used in the interface is somewhat arbitrary - now it's the one that matches the internal variable name in QE.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 2c1962c to 5e43d1e Compare October 8, 2020 10:36
@sphuber (Contributor) commented Oct 8, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

Is that a bug in yapf or intended behavior? I thought those statements only applied to the scope to which they were applied. So I assumed that was only being applied to the define method. How did you discover this? And where should it be placed then? On the line of method definition?

# in QE codes. The flags that are actually implemented in a given
# code should be specified in the '_ENABLED_PARALLELIZATION_FLAGS'
# list of each calculation subclass.
_ALLOWED_PARALLELIZATION_FLAGS = ( # yapf: disable
Contributor:
Two things:

  1. I find the prefix "allowed" a bit weird. Why not just _PARALLELIZATION_FLAGS?
  2. Why not make this a dictionary with the flag name as the key? I think you only use it in line 125, and there you transform it into a dict. Might as well define it as one then and save those lines.

Member (Author):

Yeah good point, will change that.

Member (Author):

Done!

# bind to the name `default`, and look up its value only when the lambda is
# actually called. Because `default` changes throughout the for-loop, this
# would give the wrong value.
extra_kwargs = {'default': partial(orm.Int, default)}
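The late-binding pitfall the quoted comment refers to can be reproduced in isolation: a plain lambda looks up the loop variable only when it is called, whereas `functools.partial` captures the current value at definition time. This is a generic Python illustration, not the plugin code.

```python
from functools import partial

# Plain lambdas close over the *name* `default`; by the time they are
# called, the loop has finished and they all see the last value.
late_bound = [lambda: default for default in (1, 2, 3)]
print([func() for func in late_bound])  # [3, 3, 3]

# `partial` freezes the *current value* of `default` at creation time.
early_bound = [partial(lambda value: value, default) for default in (1, 2, 3)]
print([func() for func in early_bound])  # [1, 2, 3]
```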
Contributor:

Not sure about specifying these defaults. I think this would mean that all these calculations will now always have a bunch of `Int` nodes as inputs, even if the user doesn't specify anything. I am not sure we want this, at least not until we have proper automatic node deduplication in the database. If we still need the defaults to be set for consistency, we can maybe just apply them in `prepare_for_submission` of the base class, without going through input nodes.

Member (Author):

I added these defaults mostly for caching: Two calculations could otherwise have different inputs but be functionally equivalent. They wouldn't use caching, just because the inputs are different.

I'll leave it up to you to decide if that's an important reason - I can also see the argument that the extra input nodes are unnecessary. Otherwise I can just remove the defaults.

Contributor:

Hmm, didn't think about the caching. Then again, missing a cache hit because of the parallelization flags is maybe not too bad, as it should not occur too often. That reminds me though: didn't you mention somewhere that the actual defaults in QE are a bit surprising, i.e. not all 1, or something like that?

Member (Author):

The QE defaults are 1 for all inputs except ndiag - that's why we don't have a default for that here. QE determines it based on the number of available processors and some of the other parallelization levels.

Should I remove the defaults for the other parallelization options, then?

Contributor:

Should I remove the defaults for the other parallelization options, then?

I think you have already done this, haven't you? All defaults in `_PARALLELIZATION_FLAGS` are set to `None`, which means that `required=False` will be set on the corresponding port. This means that no node will be generated unless one is passed explicitly, which is the behavior I was suggesting. So I think this now respects that, or am I missing something?

Member (Author) @greschd replied on Oct 13, 2020:

Right, I'm asking whether the code implementing defaults should also be removed. Essentially, the `default is not None` code path is dead now.

Contributor:

the `default is not None` code path is dead now.

Not quite. A subclass could still override `_PARALLELIZATION_FLAGS` to define defaults for one or more flags and automatically get the spec to work correctly. Not sure if this would be useful, but it could be the case. So maybe keeping the code is ok. That being said, what is this `MappingProxyType`, and why do you use it to create the class attribute?

Member (Author):

Ok, fine with me to leave it.

The MappingProxyType is an immutable wrapper around a dictionary. If nobody has a reference to the underlying dictionary, the object as a whole is immutable (at least, unless someone goes digging deep for a way to change it).

Here I want it to be immutable because otherwise it's easy to mess up the base class by modifying _PARALLELIZATION_FLAGS on a child class.
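A quick standalone illustration of that immutability (the flag contents here are illustrative, not the plugin's actual table):

```python
from types import MappingProxyType

_flags = {'npool': 'Number of k-point pools.'}
PARALLELIZATION_FLAGS = MappingProxyType(_flags)

# Reads behave exactly like a normal dict.
print(PARALLELIZATION_FLAGS['npool'])

# Writes through the proxy are rejected with a TypeError, protecting the
# base-class attribute from accidental modification in subclasses.
try:
    PARALLELIZATION_FLAGS['nband'] = 'oops'
except TypeError:
    print('proxy is read-only')
```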

Comment on lines 67 to 251
inputs['settings'] = Dict(dict={'CMDLINE': ['-{}'.format(flag_name), '2']})
with pytest.warns(DeprecationWarning):
generate_calc_job(fixture_sandbox, entry_point_name, inputs)
Contributor:

maybe still check here that the value of the settings is kept and not overridden?

Member (Author):

Done!

@greschd (Member, Author) commented Oct 8, 2020

Thanks for the review @sphuber, I'll make the changes tomorrow. Do you think we should aim to get this into the release?

@sphuber (Contributor) commented Oct 8, 2020

Thanks for the review @sphuber, I'll make the changes tomorrow. Do you think we should aim to get this into the release?

Honestly, since this is a bit of an experiment, would it maybe make sense to not include it so that we have it in develop for some time and can get others to test it to see if they like it?

@greschd (Member, Author) commented Oct 8, 2020

Sure, I don't mind either way.

@greschd (Member, Author) commented Oct 9, 2020

Side note: I noticed the `# yapf: disable` statement in the `define` classmethod applies to the whole file after that, not just `define`. I doubt that's intentional, but it should probably be fixed in a separate PR so as not to clutter the changes.

Is that a bug in yapf or intended behavior? I thought those statements only applied to the scope to which they were applied. So I assumed that was only being applied to the define method. How did you discover this? And where should it be placed then? On the line of method definition?

I'm not sure, but I'd lean towards a yapf bug. I discovered it when adding a # yapf: enable statement somewhere else (reversing the effect of that disable statement), and it ended up reformatting a large part of the file.

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from dc05900 to 63a28c5 Compare October 9, 2020 08:39
@greschd (Member, Author) commented Oct 9, 2020

Alright, I've addressed the comments above. One thing to still decide is whether we want to keep the ability to set defaults on the parallelization options.

If not, I would remove the corresponding code. I think it's still prudent to give the parallelization options set via settings['CMDLINE'] higher precedence though, but only because it's difficult to reliably strip them, since e.g. ['-ndiag', '', '2'] is also technically valid.

Note that I've removed the py3.5 test because we're targeting post-release, so I went with py3.6 compatibility only. I will rebase after we remove py3.5 compatibility.

@greschd greschd force-pushed the parallelization_inputs branch 3 times, most recently from 0c1e2ee to 333ce3c Compare October 13, 2020 14:17
@giovannipizzi (Member) commented:
A few comments on my side:

  • ok to keep the settings higher priority for now; however, I'm not sure ['-ndiag', '', '2'] is equivalent: when creating the submission script we escape everything, so I think that QE really gets an empty string for the value after -ndiag (whether fortran is then weird and skips it, I don't know)

  • is there a strong reason to use different ports (in a namespace) instead of a parallelisation Dict input node, doing validation later? I agree having multiple ports helps with defaults, docs etc., but exactly because every code might use different parallelisation options, and these might change between QE versions (e.g. -npools, -nks and -nk are, or used to be, all equivalent I think), isn't it better to just put the complex logic in a single python file?

@greschd (Member, Author) commented Oct 15, 2020

A few comments on my side:

* ok to keep the settings higher priority for now; however I'm not sure `['-ndiag', '', '2']` is equivalent, when creating the submissions script we escape everything so I think that QE really gets an empty string for the value after `-ndiag` (if then fortran is weird and skips it, I don't know)

Hm yeah, I have to admit I didn't test this. The point I was trying to make is that I'm not sure I'd get "stripping out" the explicitly passed parameters right, exactly because it's not very clear how these edge cases are handled.

* is there a strong reason to use different ports (in a namespace) instead of a `parallelisation` Dict input node, and do validation later? I agree having multiple ports helps for having defaults, docs etc., but exactly because every code might be using different parallelisation options, and these might change between QE versions (see e.g. `-npools`, `-nks`, `-nk` are (or used to be?) all equivalent I think), isn't it better to just put the complex logic in a single python file?

No particularly strong reason. In #405 I proposed both variants. The advantage of separate inputs is that it plays nicer with defaults:

  • it's not necessary to handle the defaults / missing parameters in code
  • it works nicer with caching: explicitly specifying the default value is the same as not specifying anything

Since we now don't specify the defaults anyway, I'm not sure how much weight that argument carries.

The "equivalent aliases" are handled: Only one variant is allowed, see the _PARALLELIZATION_FLAG_ALIASES.
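Such an alias table can be used to map any spelling back to a canonical name, roughly like this. The alias sets follow QE's `command_line_options.f90` naming, but the helper itself is an illustrative sketch, not the plugin's actual implementation.

```python
# Canonical name -> all spellings QE accepts (subset shown; illustrative).
_PARALLELIZATION_FLAG_ALIASES = {
    'npool': ('npool', 'npools', 'nk', 'nks'),
    'ndiag': ('ndiag', 'northo', 'nd'),
}

def canonicalize(flag):
    """Map any alias (with or without leading dash) to its canonical name."""
    name = flag.lstrip('-')
    for canonical, aliases in _PARALLELIZATION_FLAG_ALIASES.items():
        if name in aliases:
            return canonical
    return None  # not a known parallelization flag

print(canonicalize('-nk'))     # 'npool'
print(canonicalize('northo'))  # 'ndiag'
```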

As far as different QE versions go: to be honest, I hadn't thought of that, but I think it wouldn't be too hard to add. Since we would still want to support "older" QE versions, at most we will have to add new parallelization flags. We can always add validation that checks if the used flags are compatible with the used QE version. We'd first have to introduce an input for passing the QE version, but that's the same for both variants.

If there's preference for the Dict version, I can switch that around, no problem.

@greschd (Member, Author) commented Nov 9, 2020

If there's preference for the Dict version, I can switch that around, no problem.

@sphuber @giovannipizzi what's your preference here?

Honestly, since this is a bit of an experiment, would it maybe make sense to not include it so that we have it in develop for some time and can get others to test it to see if they like it?

Would soon-ish be a good time to push it into develop to get some exposure, or do you have a release planned soon?

@sphuber (Contributor) commented Nov 10, 2020

I am planning a release soon, which is needed for the common workflows project. Not sure if that should stop us, though. I am not sure how many people are actually working off develop and are willing to experiment with it. I have to go now, but I will try to come back to this later today and see what I think we should do.

@sphuber (Contributor) commented Nov 10, 2020

I think going with the single Dict node is preferable for now:

  • I am not sure we can justify creating around 4 additional base type nodes for each PwCalculation, especially by default.
  • We could still provide a default for the Dict node, specifying a dict with all flags set to 1 except ndiag, because it is not necessarily 1. This would essentially solve the caching problem, wouldn't it? The ndiag value is more or less included indirectly through the options where the number of processes per machine and the number of machines are defined. I am not sure if this is fully correct, because the default number of processes is set on the computer and can maybe be changed. That all being said, I am not sure we should create this node by default. I am not sure how many people are actually specifying parallelization flags currently, so potentially we would be adding a lot of extra nodes. If that is just to help with the unlikely caching hit, I am not sure it is worth it.
  • You mentioned that the individual nodes are also useful to automatically deal with default handling, but you still need to add code to prepare_for_submission, right? If we go to a single optional dict, you can simply add the handling code there, couldn't you? Or am I missing something?

TLDR: I think for now we should just be conservative and make a single optional Dict node called parallelization.

@greschd (Member, Author) commented Nov 10, 2020

Alright, got it! I'll make a single Dict, but without a default. In that case we can also raise instead of just warning when an option is specified both explicitly and in the CMDLINE.
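The raise-on-conflict behavior could look roughly like the sketch below. The function name is an assumption for illustration; the real check in the plugin also has to normalize aliases and whitespace, as discussed above.

```python
def validate_no_conflicts(parallelization, cmdline):
    """Raise if a flag appears both in `parallelization` and in the cmdline.

    `parallelization` is a plain dict of flag names to values; `cmdline`
    is a list of tokens as stored in settings['CMDLINE'].
    """
    explicit = set(parallelization)
    from_cmdline = {token.lstrip('-') for token in cmdline if token.startswith('-')}
    conflicts = sorted(explicit & from_cmdline)
    if conflicts:
        raise ValueError(f'Parallelization flags specified twice: {conflicts}')

validate_no_conflicts({'npool': 4}, ['-ndiag', '1'])  # fine: disjoint flags
```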

@greschd greschd force-pushed the parallelization_inputs branch 2 times, most recently from 53516e4 to f8a6244 Compare November 11, 2020 20:01
@greschd (Member, Author) commented Nov 11, 2020

Alright, switched the implementation to a single Dict input. The main TODOs now are documentation (where should I put this?) and adding _ENABLED_PARALLELIZATION_FLAGS to other codes (I don't know which flags each code takes).

@greschd (Member, Author) commented Nov 18, 2020

Note to self: the (largely undocumented) -n_howmany flag is missing from this PR.

@sphuber sphuber (Contributor) left a review:

Thanks @greschd. Apart from some minor requests, this is ready to be merged. I don't think we need to specify the _ENABLED_PARALLELIZATION_FLAGS for any other plugins yet; we can check how this goes for pw.x and then apply it elsewhere when needed, once we figure out how it works. As for the docs, those are a bit of a mess. If you can add a section anywhere close to the PwCalculation for now, that would be great; at some point we will hopefully rework the docs with @mbercx and incorporate it in the correct place then.

Review threads on aiida_quantumespresso/calculations/__init__.py (2, outdated) marked as resolved.
@greschd greschd force-pushed the parallelization_inputs branch 3 times, most recently from b910b7f to d2edbd7 Compare November 20, 2020 09:54
@greschd greschd requested a review from sphuber November 20, 2020 09:55
@greschd greschd marked this pull request as ready for review November 20, 2020 09:56
Implements the parallelization options as a single `Dict`
given in the `parallelization` input.

If any parallelization option is specified twice (either
both in `parallelization` and in `settings['CMDLINE']`, or
more than once in `settings['CMDLINE']`), an
`InputValidationError` is raised.
If a parallelization option is given only in
`settings['CMDLINE']`, an `AiidaDeprecationWarning` is issued.

The `BasePwCpInputGenerator` is given three class attributes:
- `_PARALLELIZATION_FLAGS` is a mapping `{flag_name: help}`
   of all possible flags.
- `_ENABLED_PARALLELIZATION_FLAGS` is a tuple of flag names that
  are implemented in a particular code.
- `_PARALLELIZATION_FLAG_ALIASES` is used to detect all possible
  variations on a flag name. These are taken from the QE source.

When checking for existing parallelization flags in the manually
passed cmdline parameters, these are normalized by splitting on
whitespace.
QE ignores flags that are capitalized differently, so we do not
have to normalize capitalization here.

Currently, `_ENABLED_PARALLELIZATION_FLAGS` is set only for PW
calculations. Technically it accepts the `-nimage` flag, but we
disallow it because (AFAIK) it doesn't make any sense - it just
repeats the same calculation.

Co-authored-by: Sebastiaan Huber <mail@sphuber.net>
@greschd (Member, Author) commented Nov 20, 2020

For the documentation, I just added a description to the Inputs section in pw.rst - since this isn't all that complicated, I don't think it needs a dedicated section.

Without the `__init__.py`, `pylint` thinks that `types.py` is a
stand-alone module, which would shadow the `types` standard
library module.
@greschd (Member, Author) commented Nov 20, 2020

Had to add an empty __init__.py to aiida_quantumespresso/common; otherwise types.py can shadow the standard library types module - at least in the eyes of pylint.

@sphuber sphuber merged commit 4fabecc into aiidateam:develop Nov 23, 2020
@sphuber (Contributor) commented Nov 23, 2020

Thanks a lot @greschd

@greschd greschd deleted the parallelization_inputs branch November 23, 2020 14:52
@greschd (Member, Author) commented Nov 23, 2020

Thanks for reviewing @sphuber!

@greschd (Member, Author) commented Nov 24, 2020

@sphuber I just noticed the squashed commit message for this PR is incorrect / outdated.

    `PwCalculation`: add explicit `parallelization` input port (#554)
    
    This new input port accepts a dictionary that allows one to specify the
    parallelization flags that `pw.x` accepts. This supersedes the old method
    of defining the flags in the `settings `input node. The latter is
    deprecated but still takes precedence if the same flag is specified
    through both input ports.

Specifically, we raise "if the same flag is specified through both input ports". Just FYI; I'm not sure if it makes sense to go back and modify the commit message, but we should probably make sure it's correct in an eventual changelog.

@sphuber (Contributor) commented Nov 25, 2020

So the only incorrect part is that it doesn't silently take the settings value in case of conflict, but actually raises? I think that is not too bad a mistake, and given that it has been in develop for a while, I don't think I should correct the message and force-push now.

@greschd (Member, Author) commented Nov 25, 2020

Yeah, makes sense - force-pushing to develop seems like a bit of a drastic measure. At the end of the day, there's always the possibility that old commit messages are now outdated.

Successfully merging this pull request may close these issues:

  • Promote parallelization settings to "proper" inputs?